Enhancing access to the Bibliome: the TREC 2004 Genomics Track

Authors

  • William R Hersh
  • Ravi Teja Bhupatiraju
  • Laura Ross
  • Phoebe Roberts
  • Aaron M Cohen
  • Dale F Kraemer
Abstract

BACKGROUND The goal of the TREC Genomics Track is to improve information retrieval in the area of genomics by creating test collections that will allow researchers to improve and better understand failures of their systems. The 2004 track included an ad hoc retrieval task, simulating use of a search engine to obtain documents about biomedical topics. This paper describes the Genomics Track of the Text Retrieval Conference (TREC) 2004, a forum for evaluation of IR research systems, where retrieval in the genomics domain has recently begun to be assessed.

RESULTS A total of 27 research groups submitted 47 different runs. The most effective runs, as measured by the primary evaluation measure of mean average precision (MAP), used a combination of domain-specific and general techniques. The best MAP obtained by any run was 0.4075. Techniques that expanded queries with gene name lists as well as words from related articles had the best efficacy. However, many runs performed more poorly than a simple baseline run, indicating that careful selection of system features is essential.

CONCLUSION Various approaches to ad hoc retrieval provide a diversity of efficacy. The TREC Genomics Track and its test collection resources provide tools that allow improvement in information retrieval systems.
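For readers unfamiliar with the primary measure named in the abstract, mean average precision (MAP) can be illustrated with a minimal Python sketch. This is not the track's evaluation software: the function names, the topic identifiers, and the document identifiers below are hypothetical and chosen only for illustration. The sketch assumes each ranked list contains no duplicate documents and that relevance judgments are binary.

```python
from typing import Dict, List, Set


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average precision for one topic: the mean of precision values taken
    at each rank where a relevant document appears, divided by the total
    number of relevant documents (so unretrieved relevant documents count as 0)."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(run: Dict[str, List[str]],
                           qrels: Dict[str, Set[str]]) -> float:
    """MAP: the mean of per-topic average precision over all judged topics.
    A topic missing from the run contributes an average precision of 0."""
    if not qrels:
        return 0.0
    total = sum(average_precision(run.get(topic, []), relevant)
                for topic, relevant in qrels.items())
    return total / len(qrels)


# Hypothetical toy data: two topics with made-up document identifiers.
toy_run = {"topic1": ["d1", "d2", "d3", "d4"], "topic2": ["d5", "d6"]}
toy_qrels = {"topic1": {"d1", "d3"}, "topic2": {"d6"}}
toy_map = mean_average_precision(toy_run, toy_qrels)
```

In the toy data, the first topic has relevant documents at ranks 1 and 3, giving an average precision of (1/1 + 2/3) / 2, and the second has its only relevant document at rank 2, giving 1/2; MAP is the mean of those two values.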

Similar resources

Enhancing Access to the Bibliome: The TREC Genomics Track

The growing amount of scientific discovery in genomics and related biomedical disciplines has led to a corresponding increase in the amount of on-line data and information. A new challenge for biomedical researchers has been how to access and manage this ever-increasing quantity of information. The Text Retrieval Conference (TREC) has implemented a Genomics Track to create an experimental envir...

TREC Genomics 2004

The TREC Genomics track started in 2003 as the first domain-specific track of the Text Retrieval Conference. The aim of the track is to develop various IR tasks specific to the biomedical field. One task of the first year involved the retrieval of documents given a specific gene, while the second task required the extraction of a brief description of gene function from documents. This year sees a...

RMIT University at TREC 2004

RMIT University participated in two tracks at TREC 2004: Terabyte and Genomics, both for the first time. This paper describes the techniques we applied and our experiments in both tracks, and discusses the results of the genomics track runs; the terabyte track results are unavailable at the time of manuscript submission. We also describe our new zettair search engine, in use for the first time ...

DIMACS at the TREC 2005 Genomics Track

This report describes DIMACS work on the text categorization task of the TREC 2005 Genomics track. Our approach to this task was similar to the triage subtask studied in the TREC 2004 Genomics track. We applied Bayesian logistic regression and achieved good effectiveness on all categories. 1. TEXT CATEGORIZATION TASK The Mouse Genome Informatics (MGI) project of the Jackson Laboratory provides ...

DIMACS at the TREC 2004 Genomics Track

DIMACS participated in the text categorization and ad hoc retrieval tasks of the TREC 2004 Genomics track. For the categorization task, we tackled the triage and annotation hierarchy subtasks. 1. TEXT CATEGORIZATION TASK The Mouse Genome Informatics (MGI) project of the Jackson Laboratory provides data on the genetics, genomics, and biology of the laboratory mouse. In particular, the Mouse Geno...

Journal title:
  • Journal of Biomedical Discovery and Collaboration

Volume 1  Issue 

Pages  -

Publication date 2006